Problem Note 46058: The Text Parsing node term frequency calculation might be incorrect on some Japanese terms
When you parse Japanese documents using the SAS® Text Miner Text Parsing node, the reported term frequency totals might be incorrect. Only katakana terms that appear as the first word in a sentence are affected.
To work around the problem, ensure that no katakana terms appear as the first word in a sentence.
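To apply the workaround, it helps to know which sentences begin with a katakana term. The following Python sketch is one way to flag them before the documents reach the Text Parsing node. It is not part of SAS Text Miner: the function name `sentences_starting_with_katakana` is hypothetical, and the sentence boundary rule (split on 。, ！, ？, !, ?, or a line break) is an assumption that might not match the node's own sentence detection.

```python
import re

# Assumption: a sentence ends at 。, ！, ？, !, ? or a line break.
# The Text Parsing node's own sentence boundary rules may differ.
SENTENCE_END = re.compile(r"[。！？!?\n]+")

# Full-width katakana letters (U+30A1-U+30FA) and
# half-width katakana letters (U+FF66-U+FF9D).
KATAKANA_START = re.compile(r"[\u30A1-\u30FA\uFF66-\uFF9D]")


def sentences_starting_with_katakana(text):
    """Return the sentences in text whose first non-space character is katakana."""
    flagged = []
    for sentence in SENTENCE_END.split(text):
        # str.strip() also removes the ideographic space U+3000.
        sentence = sentence.strip()
        if sentence and KATAKANA_START.match(sentence):
            flagged.append(sentence)
    return flagged


if __name__ == "__main__":
    sample = "テキストを解析します。これは文です。データは重要です！"
    for hit in sentences_starting_with_katakana(sample):
        print(hit)
```

The sketch only reports candidate sentences. Rewording them so that the katakana term is no longer the first word remains a manual step.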
Operating System and Release Information
Product Family: | SAS System |
Product: | SAS Text Miner |
System | Product Release Reported | Product Release Fixed* | SAS Release Reported | SAS Release Fixed* |
Microsoft® Windows® for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2003 Datacenter Edition | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2003 Enterprise Edition | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2003 Standard Edition | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2003 for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2008 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows Server 2008 for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Microsoft Windows XP Professional | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Enterprise 32 bit | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Enterprise x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Home Premium 32 bit | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Home Premium x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Professional 32 bit | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Professional x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Ultimate 32 bit | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows 7 Ultimate x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows Vista | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Windows Vista for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
64-bit Enabled AIX | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
64-bit Enabled Solaris | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
HP-UX IPF | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Linux for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
Solaris for x64 | 4.2_M1 | 5.1 | 9.2 TS2M3 | 9.3 TS1M0 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: | Problem Note |
Priority: | high |
Topic: | Analytics ==> Data Mining Analytics ==> Text Mining |
Date Modified: | 2013-02-21 12:49:04 |
Date Created: | 2012-03-19 23:22:37 |